Smooth Loss Functions for Deep Top-k Classification

Authors

  • Leonard Berrada
  • Andrew Zisserman
  • M. Pawan Kumar
Abstract

The top-k error is a common measure of performance in machine learning and computer vision. In practice, top-k classification is typically performed with deep neural networks trained with the cross-entropy loss. Theoretical results indeed suggest that cross-entropy is an optimal learning objective for such a task in the limit of infinite data. In the context of limited and noisy data, however, the use of a loss function that is specifically designed for top-k classification can bring significant improvements. Our empirical evidence suggests that the loss function must be smooth and have non-sparse gradients in order to work well with deep neural networks. Consequently, we introduce a family of smoothed loss functions that are suited to top-k optimization via deep learning. The widely used cross-entropy is a special case of our family. Evaluating our smooth loss functions is computationally challenging: a naïve algorithm would require O(C(n, k)) operations, where C(n, k) is the binomial coefficient and n is the number of classes. Thanks to a connection to polynomial algebra and a divide-and-conquer approach, we provide an algorithm with a time complexity of O(kn). Furthermore, we present a novel approximation to obtain fast and stable algorithms on GPUs with single floating point precision. We compare the performance of the cross-entropy loss and our margin-based losses in various regimes of noise and data size, for the predominant use case of k = 5. Our investigation reveals that our loss is more robust to noise and overfitting than cross-entropy.
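
The "connection to polynomial algebra" mentioned above is that the sums over k-subsets appearing in the smoothed loss have the structure of elementary symmetric polynomials. A minimal Python sketch (ours, not the paper's divide-and-conquer GPU algorithm) of how the standard recurrence collapses the C(n, k)-term sum to O(kn):

```python
import math
from itertools import combinations

def esp_naive(x, k):
    """e_k(x): sum over all k-subsets of the product of their elements.
    Enumerates C(n, k) subsets -- intractable for large n."""
    return sum(math.prod(s) for s in combinations(x, k))

def esp_dp(x, k):
    """Same value in O(kn) time via the recurrence
    e_j(x_1..x_i) = e_j(x_1..x_{i-1}) + x_i * e_{j-1}(x_1..x_{i-1})."""
    e = [1.0] + [0.0] * k          # e[0] = 1, e[1..k] = 0 for the empty prefix
    for xi in x:
        for j in range(k, 0, -1):  # descend so e[j-1] still holds the old value
            e[j] += xi * e[j - 1]
    return e[k]

# In the smoothed loss, x_i would be exp(s_i / tau) for scores s_i and a
# smoothing temperature tau (names assumed here); per the abstract, the paper
# additionally uses divide-and-conquer and an approximation to obtain stable
# single-precision GPU evaluation.
```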

Related articles

Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification

Top-k error is currently a popular performance measure on large scale image classification benchmarks such as ImageNet and Places. Despite its wide acceptance, our understanding of this metric is limited as most of the previous research is focused on its special case, the top-1 error. In this work, we explore two directions that shed light on the top-k error. First, we provide an in-depth analy...
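
For context, a minimal NumPy sketch of how the top-k error itself is typically computed from a matrix of class scores (an illustration with assumed array shapes, not code from the paper):

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k
    highest-scoring classes. scores: (N, C), labels: (N,) class indices."""
    # indices of the k largest scores per row (unsorted within the top k)
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()
```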


Learning Deep Features for One-Class Classification

We propose a deep learning-based solution for the problem of feature learning in one-class classification. The proposed method operates on top of a Convolutional Neural Network (CNN) of choice and produces descriptive features while maintaining a low intra-class variance in the feature space for the given class. For this purpose, two loss functions, compactness loss and descriptiveness loss, are ...
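
The snippet describes compactness loss only as keeping intra-class variance low. One common way to express that idea, shown purely as a hypothetical sketch rather than the paper's exact formulation, is to penalize each feature vector's squared distance to the batch centroid:

```python
import numpy as np

def compactness_loss(features):
    """Mean squared distance to the batch centroid -- a simple proxy for
    intra-class variance. features: (N, D) array for one class."""
    centroid = features.mean(axis=0, keepdims=True)
    return np.mean(np.sum((features - centroid) ** 2, axis=1))
```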


Deep Learning using Support Vector Machines

Recently, fully-connected and convolutional neural networks have been trained to reach state-of-the-art performance on a wide variety of tasks such as speech recognition, image classification, natural language processing, and bioinformatics. For classification tasks, many of these "deep learning" models employ the softmax activation function to learn output labels in 1-of-K format. In thi...
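
The snippet breaks off before the substitution it sets up, but the title indicates that the softmax layer is replaced with a support vector machine objective. A minimal sketch of the two per-sample losses being contrasted (our illustration; the paper itself advocates a squared-hinge L2-SVM variant):

```python
import numpy as np

def softmax_cross_entropy(scores, y):
    """Standard 1-of-K objective: negative log-probability of class y."""
    z = scores - scores.max()              # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[y]

def multiclass_hinge(scores, y, margin=1.0):
    """SVM-style objective: penalize any class scoring within a margin
    of the true class (Crammer-Singer form)."""
    diffs = margin + scores - scores[y]
    diffs[y] = 0.0
    return max(0.0, float(diffs.max()))
```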


A novel method based on a combination of deep learning algorithm and fuzzy intelligent functions in order to classification of power quality disturbances in power systems

Automatic classification of power quality disturbances is the foundation for dealing with power quality problems. From the traditional point of view, the identification process of power quality disturbances is divided into three independent stages: signal analysis, feature selection, and classification. However, there are some inherent defects in signal analysis and the procedure of manual fe...


Learning with Average Top-k Loss

In this work, we introduce the average top-k (ATk) loss as a new ensemble loss for supervised learning: the average of the k largest individual losses over a training dataset. We show that the ATk loss is a natural generalization of two widely used ensemble losses, namely the average loss and the maximum loss, and that it can combine their advantages and mitigate their drawbacks to bett...
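
The definition in the snippet translates directly into code; a minimal sketch of the ATk loss as stated (with k = 1 recovering the maximum loss and k = N the average loss):

```python
import numpy as np

def average_top_k_loss(per_sample_losses, k):
    """Average of the k largest individual losses over the dataset."""
    losses = np.sort(np.asarray(per_sample_losses))
    return losses[-k:].mean()   # k largest values
```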



Journal:
  • CoRR

Volume: abs/1802.07595  Issue: –

Pages: –

Publication date: 2018